On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs
Authors
Y. Weiss, W. T. Freeman
Abstract
Graphical models, such as Bayesian networks and Markov random fields, represent statistical dependencies of variables by a graph. The max-product "belief propagation" algorithm is a local message-passing algorithm on this graph that is known to converge to a unique fixed point when the graph is a tree. Furthermore, when the graph is a tree, the assignment based on the fixed point yields the most probable a posteriori (MAP) values of the unobserved variables given the observed ones. Recently, good empirical performance has been obtained by running the max-product algorithm (or the equivalent min-sum algorithm) on graphs with loops, for applications including the decoding of "turbo" codes. Except for two simple graphs (cycle codes and single-loop graphs) there has been little theoretical understanding of the max-product algorithm on graphs with loops. Here we prove a result on the fixed points of max-product on a graph with arbitrary topology and with arbitrary probability distributions (discrete- or continuous-valued nodes). We show that the assignment based on a fixed point is a "neighborhood maximum" of the posterior probability: the posterior probability of the max-product assignment is guaranteed to be greater than that of all other assignments in a particular large region around that assignment. The region includes all assignments that differ from the max-product assignment in any subset of nodes that forms no more than a single loop in the graph. In some graphs this neighborhood is exponentially large. We illustrate the analysis with examples.

Problems involving probabilistic belief propagation arise in a wide variety of applications, including error-correcting codes, speech recognition and image understanding. Typically, a probability distribution is assumed over a set of variables and the task is to infer the values of the unobserved variables given the observed ones. The assumed probability distribution is described using a graphical model [13]: the qualitative aspects of the distribution are specified by a graph structure. The graph may either be directed, as in a Bayesian network [17], [11], or undirected, as in a Markov random field [17], [9]. Here we focus on the problem of finding an assignment for the unobserved variables that is most probable given the observed ones. In general, this problem is NP-hard [18], but if the graph is singly connected (i.e., there is only one path between any two given nodes) then there exist efficient local message-passing schemes to perform this task.

Pearl [17] derived such a scheme for singly connected Bayesian networks. The algorithm, which he called "belief revision", is identical to his algorithm for finding posterior marginals over nodes except that the summation operator is replaced with a maximization. Aji et al. [2] have shown that both of Pearl's algorithms can be seen as special cases of generalized distributive laws over particular semirings. In particular, Pearl's algorithm for finding maximum a posteriori (MAP) assignments can be seen as a generalized distributive law over the max-product semiring. We will henceforth refer to it as the "max-product" algorithm. Pearl showed that for singly connected networks, the max-product algorithm is guaranteed to converge and that the assignment based on the messages at convergence is guaranteed to give the optimal assignment values corresponding to the MAP solution. Several groups have recently reported excellent experimental results by running the max-product algorithm on graphs with loops [22], [6], [3], [19], [6], [10].
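To make the local message passing concrete, here is a minimal sketch (not the authors' code) of the max-product updates for the simplest singly connected case: a chain-structured Markov random field with pairwise potentials. The names phi, psi and max_product_chain are illustrative assumptions; on a chain the decoded assignment is the exact MAP solution, which is the tree-structured guarantee referred to above.

    import numpy as np

    def max_product_chain(phi, psi):
        """phi: list of n local potentials, each a length-k array;
           psi: list of n-1 pairwise potentials, each a (k, k) array.
           Returns the MAP assignment as a list of state indices."""
        n = len(phi)
        back = [None] * (n - 1)        # argmax pointers used for decoding
        m = np.ones(len(phi[0]))       # message flowing into node 0
        for i in range(n - 1):
            # Combine the local evidence at node i with the incoming message,
            # then "max out" node i over the pairwise potential to node i+1.
            scores = (phi[i] * m)[:, None] * psi[i]   # shape (k, k)
            back[i] = np.argmax(scores, axis=0)       # best x_i for each x_{i+1}
            m = scores.max(axis=0)
            m = m / m.sum()                           # normalize for numerical stability
        # Decode: maximize at the last node, then follow the pointers back.
        x = [0] * n
        x[-1] = int(np.argmax(phi[-1] * m))
        for i in range(n - 2, -1, -1):
            x[i] = int(back[i][x[i + 1]])
        return x

On a graph with loops the same local updates are simply iterated until (and if) the messages stop changing; that "loopy" setting is the one analyzed in this paper.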
Benedetto et al. used the max-product algorithm to decode "turbo" codes and obtained excellent results that were slightly inferior to the original turbo decoding algorithm (which is equivalent to the sum-product algorithm). Weiss [19] compared the performance of sum-product and max-product on a "toy" turbo code problem while distinguishing between converged and unconverged cases. He found that if one considers only the convergent cases, the performance of max-product decoding is significantly better than sum-product decoding. However, the max-product algorithm converges less often, so its overall performance (including both convergent and nonconvergent cases) is inferior.

Progress in the analysis of the max-product algorithm has been made for two special topologies: single-loop graphs and "cycle codes". For graphs with a single loop [22], [19], [20], [5], [2], it can be shown that the algorithm converges to a stable fixed point or a periodic oscillation. If it converges to a stable fixed point, then the assignment based on the fixed-point messages is the optimal assignment. For graphs that correspond to cycle codes (low-density parity-check codes in which each bit is checked by exactly two check nodes), Wiberg [22] gave sufficient conditions for max-product to converge to the transmitted codeword and Horn [10] gave sufficient conditions for convergence to the MAP assignment.

In this paper we analyze the max-product algorithm in graphs of arbitrary topology. We show that at a fixed point of the algorithm, the assignment is a "neighborhood maximum" of the posterior probability: the posterior probability of the max-product assignment is guaranteed to be greater than that of all other assignments in a particular large region around that assignment. These results motivate using this powerful algorithm in a broader class of networks.

Y. Weiss is with the Computer Science Division, 485 Soda Hall, UC Berkeley, Berkeley, CA 94720-1776. E-mail: [email protected]. Support by MURI-ARO-DAAH04-96-1-0341, MURI N00014-00-1-0637 and NSF IIS-9988642 is gratefully acknowledged. W. T. Freeman is with MERL, Mitsubishi Electric Research Labs., 201 Broadway, Cambridge, MA 02139. E-mail: [email protected].
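To illustrate what the neighborhood-maximum statement compares, the following sketch (again illustrative, not from the paper) scores an assignment under an unnormalized pairwise posterior and brute-forces the exact MAP on a tiny graph; the theorem says a max-product fixed-point assignment scores at least as high as any assignment obtained by changing a subset of nodes that forms at most a single loop. The names posterior_score and exact_map, and the dict-based graph representation, are assumptions made for this sketch.

    import itertools
    import numpy as np

    def posterior_score(x, phi, psi):
        """Unnormalized posterior of assignment x on a pairwise MRF.
           x: dict node -> state index; phi: dict node -> length-k array;
           psi: dict (u, v) -> (k, k) array of pairwise potentials."""
        score = 1.0
        for u, pot in phi.items():
            score *= pot[x[u]]
        for (u, v), pot in psi.items():
            score *= pot[x[u], x[v]]
        return score

    def exact_map(phi, psi):
        """Brute-force MAP over all joint states -- only feasible for tiny
           graphs, but useful for checking a max-product assignment on a
           toy example with loops."""
        nodes = sorted(phi)
        best, best_x = -np.inf, None
        for states in itertools.product(*(range(len(phi[u])) for u in nodes)):
            x = dict(zip(nodes, states))
            s = posterior_score(x, phi, psi)
            if s > best:
                best, best_x = s, x
        return best_x, best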
Similar resources
Correctness of Belief Propagation in Gaussian Graphical Models of Arbitrary Topology
Graphical models, such as Bayesian networks and Markov random fields, represent statistical dependencies of variables by a graph. Local "belief propagation" rules of the sort proposed by Pearl [20] are guaranteed to converge to the correct posterior probabilities in singly connected graphs. Recently good performance has been obtained by using these same rules on graphs with loops, a method known a...
Linear Response for Approximate Inference
Belief propagation on cyclic graphs is an efficient algorithm for computing approximate marginal probability distributions over single nodes and neighboring nodes in the graph. In this paper we propose two new algorithms for approximating joint probabilities of arbitrary pairs of nodes and prove a number of desirable properties that these estimates fulfill. The first algorithm is a propagation ...
Tree consistency and bounds on the performance of the max-product algorithm and its generalizations
Finding the maximum a posteriori (MAP) assignment of a discrete-state distribution specified by a graphical model requires solving an integer program. The max-product algorithm, also known as the max-plus or min-sum algorithm, is an iterative method for (approximately) solving such a problem on graphs with cycles. We provide a novel perspective on the algorithm, which is based on the idea of re...
Fast Inference and Learning with Sparse Belief Propagation
Even in trees, exact probabilistic inference can be expensive when the cardinality of the variables is large. This is especially troublesome for learning, because many standard estimation techniques, such as EM and conditional maximum likelihood, require calling an inference algorithm many times. In max-product inference, a standard heuristic for controlling this complexity in linear chains is ...
MERL, a Mitsubishi Electric Research Laboratory: Correctness of Belief Propagation in Gaussian Graphical Models of Arbitrary Topology
Local "belief propagation" rules of the sort proposed by Pearl [12] are guaranteed to converge to the correct posterior probabilities in singly connected graphical models. Recently, a number of researchers have empirically demonstrated good performance of "loopy belief propagation", using these same rules on graphs with loops. Perhaps the most dramatic instance is the near Shannon-limit performan...
Journal: IEEE Trans. Information Theory
Volume: 47
Pages: -
Publication year: 2001